Sparse Fisher Discriminant Analysis for Computer Aided Detection
Abstract
We describe a method for sparse feature selection for a class of problems motivated by our work on Computer-Aided Detection (CAD) systems, which identify structures of interest in medical images. Typical CAD data sets for classification are large (several thousand candidates) and unbalanced (significantly fewer than 1% of the candidates are "positive"). To be accepted by physicians, CAD systems must generalize well, with extremely high sensitivity and very few false positives. To find features that lead to superior generalization, researchers typically generate a large number of experimental features for each candidate. So many features are needed because there are no definitive methods for capturing the shape and image-based characteristics that correspond to the diagnostic features physicians use to identify structures of interest in the image, for example, cancerous polyps in a CT (computed tomography) volume of a patient's colon. Thus 100 or more shape, texture, and intensity-based features may be generated for each candidate at various levels of resolution. We propose a sparse formulation of the Fisher Linear Discriminant (FLD) that scales well to large datasets; our method inherits all the desirable properties of FLD while handling large numbers of irrelevant and redundant features better. We demonstrate that our sparse FLD formulation outperforms conventional FLD and two other feature selection methods from the literature on both an artificial dataset and a real-world Colon CAD
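The abstract does not state the optimization problem itself, so the following is only an illustrative sketch of how a sparse linear discriminant can be obtained, not the authors' formulation. It assumes the well-known equivalence between two-class FLD and least-squares regression onto class-coded targets, adds an L1 penalty to drive the weights of irrelevant features to exactly zero, and solves the result with proximal gradient descent (ISTA). The function name `sparse_fld_ista` and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np


def sparse_fld_ista(X, y, lam=0.1, n_iter=500):
    """L1-penalised least-squares surrogate for a two-class Fisher discriminant.

    Assumption: this is NOT the paper's formulation. It minimises
        (1 / 2n) * ||Xc @ w - t||^2 + lam * ||w||_1
    where Xc is the column-centred data and t are class-coded targets.
    X has shape (n, d); y holds labels in {0, 1}, both classes present.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    n_pos = int(np.sum(y == 1))
    n_neg = n - n_pos

    # Targets n/n_pos and -n/n_neg: without the penalty, the least-squares
    # solution is proportional to the classical FLD direction.
    t = np.where(y == 1, n / n_pos, -n / n_neg)
    Xc = X - X.mean(axis=0)

    # Step size 1/L, where L = sigma_max(Xc)^2 / n bounds the gradient's
    # Lipschitz constant for the smooth part of the objective.
    step = n / (np.linalg.norm(Xc, 2) ** 2)

    w = np.zeros(d)
    for _ in range(n_iter):
        grad = Xc.T @ (Xc @ w - t) / n
        z = w - step * grad
        # Soft-thresholding: the proximal operator of the L1 penalty.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w


if __name__ == "__main__":
    # Synthetic, unbalanced data: only feature 0 separates the classes.
    rng = np.random.default_rng(0)
    labels = np.zeros(400, dtype=int)
    labels[:40] = 1
    data = rng.normal(size=(400, 10))
    data[labels == 1, 0] += 2.0
    weights = sparse_fld_ista(data, labels, lam=0.8)
    print("selected features:", np.flatnonzero(weights))
```

Larger values of `lam` zero out more weights; the features with nonzero weight are the ones the sketch "selects". With the heavy class imbalance described in the abstract, the class-coded targets keep the minority class from being swamped, but the penalty would still need tuning on held-out data.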
Similar resources
A computer aided detection framework for mammographic images using fisher linear discriminant and nearest neighbor classifier
Today, mammography is the best method for early detection of breast cancer. Radiologists failed to detect evident cancerous signs in approximately 20% of false negative mammograms. False negatives have been identified as the inability of the radiologist to detect the abnormalities due to several reasons such as poor image quality, image noise, or eye fatigue. This paper presents a framework for...
Fisher Discriminant Analysis (FDA), a supervised feature reduction method in seismic object detection
Automatic processes on seismic data using pattern recognition is one of the interesting fields in geophysical data interpretation. One part is the seismic object detection using different supervised classification methods that finally has an output as a probability cube. Object detection process starts with generating a pickset of two classes labeled as object and non-object and then selecting ...
Survey on Perception of People Regarding Utilization of Computer Science & Information Technology in Manipulation of Big Data, Disease Detection & Drug Discovery
This research explores the manipulation of biomedical big data and disease detection using automated computing mechanisms. Because an efficient and cost-effective way to detect disease and discover drugs is important for society, computer-aided automated systems are a must. This paper aims to understand how important people consider computer-aided automated systems to be. The analysis result from collected da...
Comparison of Parametric and Non-parametric EEG Feature Extraction Methods in Detection of Pediatric Migraine without Aura
Background: Migraine headache without aura is the most common type of migraine especially among pediatric patients. It has always been a great challenge of migraine diagnosis using quantitative electroencephalography measurements through feature classification. It has been proven that different feature extraction and classification methods vary in terms of performance regarding detection and di...
Performance analysis of a new computer aided detection system for identifying lung nodules on chest radiographs
A new computer aided detection (CAD) system is presented for the detection of pulmonary nodules on chest radiographs. Here, we present the details of the proposed algorithm and provide a performance analysis using a publicly available database to serve as a benchmark for future research efforts. All aspects of algorithm training were done using an independent dataset containing 167 chest radiog...